Search Result

Journals

Publication Years

Keywords

Please wait a minute...

For Selected:

Download Citations
EndNote Ris BibTeX

Toggle Thumbnails

Select

Parallel high utility pattern mining algorithm based on cluster partition

XING Shuning, LIU Fang'ai, ZHAO Xiaohui

Journal of Computer Applications 2016, 36 (8): 2202-2206. DOI: 10.11772/j.issn.1001-9081.2016.08.2202

Abstract （493）

PDF （844KB）（349）

Save

The exiting algorithms generate a lot of utility pattern trees based on memory when mining high utility patterns in large-scale database, leading to occupying more memory spaces and losing some high utility itemsets. Using Hadoop platform, a parallel high utility pattern mining algorithm, named PUCP, based on cluster partition was proposed. Firstly, the clustering method was introduced to divide the transaction database into several sub-datasets. Secondly, sub-datasets were allocated to each node of Hadoop to construct utility pattern tree. Finally, the conditional pattern bases of the same item which generated from utility pattern trees were allocated to the same node, reducing the crossover operation times of each node. The theoretical analysis and experimental results show that, compared with the mainstream serial high utility pattern mining algorithm named UP-Growth (Utility Pattern Growth) and parallel high utility pattern mining algorithm named HUI-Growth (Parallel mining High Utility Itemsets by pattern-Growth), the mining efficiency of PUCP is increased by 61.2% and 16.6% respectively without affecting the reliability of the mining results; and the memory pressure of large data mining can be effectively relieved by using Hadoop platform.

Reference | Related Articles | Metrics